Sentence Level Machine Translation Evaluation as a Ranking Problem: one step aside from BLEU
نویسندگان
چکیده
The paper proposes formulating MT evaluation as a ranking problem, as is often done in the practice of assessment by human. Under the ranking scenario, the study also investigates the relative utility of several features. The results show greater correlation with human assessment at the sentence level, even when using an n-gram match score as a baseline feature. The feature contributing the most to the rank order correlation between automatic ranking and human assessment was the dependency structure relation rather than BLEU score and reference language model feature.
منابع مشابه
Sentence Level Machine Translation Evaluation as a Ranking
The paper proposes formulating MT evaluation as a ranking problem, as is often done in the practice of assessment by human. Under the ranking scenario, the study also investigates the relative utility of several features. The results show greater correlation with human assessment at the sentence level, even when using an n-gram match score as a baseline feature. The feature contributing the mos...
متن کاملAMBER: A Modified BLEU, Enhanced Ranking Metric
This paper proposes a new automatic machine translation evaluation metric: AMBER, which is based on the metric BLEU but incorporates recall, extra penalties, and some text processing variants. There is very little linguistic information in AMBER. We evaluate its system-level correlation and sentence-level consistency scores with human rankings from the WMT shared evaluation task; AMBER achieves...
متن کاملA Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
متن کاملRegression and Ranking based Optimisation for Sentence Level Machine Translation Evaluation
Automatic evaluation metrics are fundamentally important for Machine Translation, allowing comparison of systems performance and efficient training. Current evaluation metrics fall into two classes: heuristic approaches, like BLEU, and those using supervised learning trained on human judgement data. While many trained metrics provide a better match against human judgements, this comes at the co...
متن کاملHybrid Machine Translation Using Joint, Binarised Feature Vectors
We present an approach for Hybrid Machine Translation, based on a Machine-Learning framework. Our method combines output from several source systems. We first define an extensible, total order on translations and use it to estimate a ranking on the sentence level for a given set of systems. We introduce and define the notion of joint, binarised feature vectors. We train an SVM-based classifier ...
متن کامل